LHC experiments have placed strong bounds on the production of supersymmetric 
colored particles (squarks and gluinos), under the assumption that all flavors of 
squarks are nearly degenerate. However, the current experimental constraints on 
stop squarks are much weaker, due to the smaller production cross section and 
difficult backgrounds. While light stops are motivated by naturalness arguments, 
it has been suggested that such particles become nearly impossible to detect near 
the limit where their mass is degenerate with the sum of the masses of their decay 
products. We show that this is not the case, and that searches based on missing 
transverse energy ($\slashed{E}_T$) have significant reach for stop masses above 175 GeV, even 
in the degenerate limit. We consider direct pair production of stops, decaying to 
invisible LSPs and tops with either hadronic or semi-leptonic final states. Modest 
intrinsic differences in $\slashed{E}_T$ are magnified by boosted kinematics and by shape analyses 
of $\slashed{E}_T$ or suitably-chosen observables related to $\slashed{E}_T$. For these observables we show 
that the distributions of the relevant backgrounds and signals are well-described 
by simple analytic functions, in the kinematic regime where signal is enhanced. 
Shape analyses of $\slashed{E}_T$-related distributions will allow the LHC experiments to place 
significantly improved bounds on stop squarks, even in scenarios where the stop-LSP 
mass difference is degenerate with the top mass. Assuming 20 fb$^{-1}$ of luminosity at 
$\sqrt{s} = 8$ TeV, we conservatively estimate that experiments can exclude or discover 
degenerate stops with mass as large as ~360 GeV, and stops as heavy as 560 GeV for massless LSPs. 

I. INTRODUCTION 

In the search for physics beyond the Standard Model (SM), the top sector holds unique 
significance. As the top quark has the largest Yukawa coupling to the Higgs, it contributes 
one of the largest loop corrections to the Higgs mass, exacerbating the Higgs naturalness 
problem. To avoid a large degree of tuning, we therefore expect a top partner [1, 2] that is 
not too much heavier than the top itself, and can be considerably lighter than most other 
new physics states. 

In models of softly-broken supersymmetry (SUSY), this expectation is reinforced by the 
connection between electroweak symmetry breaking and soft SUSY breaking. In any such 
model, one can write an expression that relates the mass of the Standard Model Z boson 
to a linear combination of soft-breaking masses, together with the supersymmetric Higgsino 
mass parameter $\mu$. This implies either that the soft-breaking mass parameters are not too 
far above the electroweak scale, or that the underlying high energy theory enforces relations 
among parameters that lead to cancellations in the effective low energy theory. However, the 
latter option is itself strongly constrained by the renormalization group (RG) running of the 
soft-breaking SUSY parameters and SM parameters, which implies a complicated mapping 
from the high scale theory to the effective theory probed by experiments. The largest RG 
effects are related to the largest couplings, and again the top sector has unique importance. 
This implies that one or both of the stop squarks, the scalar superpartners of the top quark, 
are expected to be relatively light. 

In R-parity conserving SUSY the stop is not a good dark matter candidate, so we will 
neglect the possibility that the lightest stop is also the lightest superpartner (LSP), and 
assume that the actual LSP is a weakly interacting particle such as a gaugino. It is quite 
possible, however, that the lightest stop ($\tilde t_1$) is the next-to-lightest superpartner (NLSP). 
Because of R-parity and charge conservation, stops are produced in pairs in hadronic 
collisions. Once produced, a stop will decay to the LSP plus SM particles, a decay that can be 
two body, three body, four body, or even more, depending upon the mass spectrum of the 
other superpartners whose off-shell couplings connect the stop to the LSP. 

In the LHC era, null results from searches for extensions to the Standard Model have 
excluded new strongly interacting particles with masses that in some cases exceed a TeV 
[3, 4]. While the LHC experimental searches have been inclusive, the resulting mass limits 
vary according to the production cross section and decay properties of the new particles. In 
particular squark mass limits derived from LHC experiments often assume four flavors of 
degenerate squarks, with an additional two-fold degeneracy between the squark partners of 
left-handed and right-handed quarks. These limits obviously do not apply to a single light 
stop. Both ATLAS and CMS have begun to constrain models in which pair-produced gluinos 
decay via stop-top pairs [U [5] , but of course signals in this mode depend on the gluinos being 
kinematically accessible. Direct stop production has been constrained in the special case 
that both stops decay to a top and a neutralino, and the neutralino then decays to a gravitino 
and a $Z$; in this topology ATLAS excludes stops up to 240–330 GeV (depending on the 
neutralino mass) using 2.05 fb$^{-1}$ [6]. 

In many models, including SUSY models (on which we focus our attention), the top 
partner decays directly to a top and an undetected weakly-interacting particle (i.e. 
$\tilde t \to t\chi$), leading to a final state with missing transverse energy ($\slashed{E}_T$). Our analysis will focus 
exclusively on this possibility, which is the most generic. If there is a sufficiently light 
chargino then the decay $\tilde t \to b\chi^+$ becomes important, and we will consider this 
case in a sequel to this report. Other special cases require more specialized consideration; for 
example, light sleptons enhance stop decays with multilepton final states. For stops lighter 
than the top, decays could proceed either through an off-shell top (a possibility we will 
consider in this work), an off-shell chargino, or through a flavor-changing decay $\tilde t \to c\chi$ 
[7–11]. The possibility of such very light stops is already constrained by Tevatron searches 
[12, 13], but covering all of the remaining parameter space at the LHC is challenging 
[14–16]. 

For stop pairs decaying via $\tilde t \to t\chi$, the current leading technique looks for excesses 
in $t\bar t + \slashed{E}_T$ with the top pair decaying into (semi-)leptonic final states [17, 18]. A recent 
study of the LHC reach suggests that a semi-leptonic analysis could extend the bounds to 
750–800 GeV with 20 fb$^{-1}$, assuming that the lightest superpartner particle (LSP) is much 
less massive than the stop [19]. For heavier stops, the existing searches can be improved 
by using boosted top-tagging [20–22]. However, as this requires a large splitting between 
the mass of the stop and the top+LSP pair, it is ineffective when the stop $p_T$ is below 
~200 GeV. In Refs. [23], [24], it was estimated that, if updated to 1 fb$^{-1}$, LHC searches 
[25, 26] combined with previous searches at the Tevatron [27–31] could exclude direct stop 
pair production decaying to light gravitinos for stop masses up to 180 GeV. New results from 
stop searches with the full 2011 datasets are expected soon from ATLAS and CMS. Clearly, 
this is a search of considerable interest to the experimental and theoretical community; as a 
result, during the completion of this paper, we became aware of two additional theoretical 
groups working on improving stop sensitivity at the LHC [32], [33]. 

Many theorists have considered the possibility that a light stop may be nearly degenerate 
in mass with the sum of the masses of its decay products. Some have even proposed that 
"degenerate" stops are a natural result of well-motivated SUSY models. For example in 
Ref. [34] an explicit model was presented with a nearly massless LSP and a lightest stop 
with mass 188 GeV. The literature on degenerate stops has so far assumed that $m_{\tilde t} - m_\chi \simeq m_t$ 
implies that such particles are invisible to $\slashed{E}_T$-based LHC searches, even if the stops have 
rather large production rates. This implicit no-go theorem is especially strong for stops 
decaying predominantly via $\tilde t \to t\chi$, where the stop pair signal mimics conventional $t\bar t$ 
production. Even away from the degenerate limit, semi-leptonic decay channels have the 
disadvantage that $\slashed{E}_T$ from the LSPs has to compete with the $\slashed{E}_T$ contributed by neutrinos 
from top decays. 

In this report we dispel this conventional pessimism about LHC detection of degenerate or 
nearly-degenerate stops, specifically for stops that are at least as heavy as the top quark. We 
present search techniques that are sensitive to the pair production of top partners decaying 
into tops and invisible particles, even in the case of exactly degenerate mass spectra. We 
consider both the semi-leptonic final state (isolated muon or electron plus hadronic jets plus 
$\slashed{E}_T$) and the fully hadronic final state (jets + $\slashed{E}_T$). For the semi-leptonic case we assume a 
conventional lepton trigger, while for the hadronic final state we assume a four-jet trigger 
as already implemented by CMS and ATLAS [35] . 

Our first major observation is that the $\slashed{E}_T$ distribution for stop pair production differs 
significantly from that of $t\bar t$, even in the case where the stops are degenerate. This follows 
from the fact that $\slashed{E}_T$, despite its calorimeter-centric origins, is a measurement of missing 
momentum, not missing energy, as well as the fact that stops and tops have a significant 
decay width. The resulting intrinsic differences in $\slashed{E}_T$ for stops and tops are then magnified 
by boosted kinematics, taking advantage of the large phase space accessible to stop and top 
production at the LHC. 

Our second major observation is that even rather small differences in $\slashed{E}_T$ or $\slashed{E}_T$-related 
spectra can be detected using a shape analysis. For $\slashed{E}_T$-based observables we show that, in 
the kinematic regime where signal is enhanced, the distributions of the relevant backgrounds 
are well-described by simple analytic functions. This background-fitting technique is 
motivated by the CMS Razor searches [36–38], which in 2010 and 2011 successfully implemented 
one and two-dimensional shape analyses into inclusive SUSY searches and a third-generation 
leptoquark search. The Razor searches are based on the Razor kinematic variables $M_R$ and 
$R$, where $R$ is related to the $\slashed{E}_T$ fraction of the event [39]. Rather than attempt to reproduce 
the 2D Razor fitting techniques, our analysis focuses on simpler 1D shape analyses. The 
success of the Razor both validates the realism of our basic approach, and suggests that the 
application of the CMS Razor to a degenerate stop search would result in equal or greater 
sensitivity than discussed here. 

In Section II we describe in detail our search strategy, focusing on the missing transverse 
momentum distribution of stop events, as well as the fitting of background distributions. In 
Section III, we use the results of the MadGraph, Pythia, and modified PGS4 simulation tools 
to demonstrate the reach of our technique for hadronic stop searches in the next year of 
LHC running. Finally, in Section IV we apply the shape analysis technique to a kinematic 
distribution related to $\slashed{E}_T$ ($M_T^W$) in semi-leptonic top decays. In our conclusions, we present 
the expected exclusion limit using the combination of these two orthogonal searches; as we 
will show, using the expected luminosity from the LHC in 2012 (20 fb$^{-1}$) these shape analyses 
can potentially exclude stops up to 560 GeV when the LSP is very light, and up to 360 GeV 
when the sum of the top and LSP masses is degenerate with the stop mass. This constitutes 
a significant improvement over the current and projected bounds from the standard stop 
searches. 



II. $\slashed{E}_T$ AND $\slashed{E}_T$-RELATED METHODS 

As discussed above, the SUSY scenario we wish to consider is one in which the stop is 
considerably lighter than the other squarks (and gluino) and decays directly to a top and 
the LSP, which for concreteness we take to be a neutralino. In particular, we will consider 
the simplified model [40] of a single light stop squark ($\tilde t_1$, henceforth $\tilde t$) which decays to a 
top and a neutralino. Since the stop is a colored scalar, its production is dominated by QCD 
processes and so is only very weakly sensitive to the details of the rest of the superpartner 
spectrum. 

FIG. 1: Stop pair production at $\sqrt{s} = 8$ TeV, calculated at NLO using Prospino [41]. 

The rate and kinematics of the $\tilde t \to t\chi$ process are determined by two parameters: the 
stop mass, $m_{\tilde t}$, and the neutralino mass, $m_\chi$. Once the stop decays and the neutralino 
escapes the detector, the only visible states in the signal events are the decay products of 
the tops. The only remaining indication of stop production is the missing transverse energy 
carried away by the LSPs. In Figure 1 we show the production cross section at NLO for 
the LHC with $\sqrt{s} = 8$ TeV. All other superpartner masses are set to 1 TeV, except for the 
neutralino. Clearly, the small production cross section for a single stop pair, combined with 
the lack of multiple observables to distinguish signal from background, makes the search for stops 
challenging. 

However, the presence of intrinsic $\slashed{E}_T$ is a handle that allows signal to be distinguished 
from backgrounds. The existence of $\slashed{E}_T$ in stop events will not only affect the number of 
events with large $\slashed{E}_T$ but also the distribution of these events. Furthermore, the background 
distribution of $\slashed{E}_T$ can be well modeled using simple analytic functions, which can, in many 
cases, be measured in high statistics control regions. Using the shape of the $\slashed{E}_T$ distribution 
provides a powerful tool to distinguish signal and background, as we outline below. 

Background $\slashed{E}_T$ shapes 

Since the signal contains the decay products of two top quarks and intrinsic $\slashed{E}_T$, the 
largest SM backgrounds will come from $t\bar t$, QCD multi-jet production and $W$+jets. Which 
of these processes dominates depends on the range of $\slashed{E}_T$ and the mode of top decay. In 
order to limit the source of non-LSP $\slashed{E}_T$ (which would dilute the signal), we consider only 
fully hadronic top decay, for which our analysis applies a lepton veto; and semi-leptonic top 
decays, for which we require exactly one isolated lepton. 



For the case of fully hadronic tops (the full analysis of which is described in Section III), 
there are two major sources of $\slashed{E}_T$ in top background events. The first is detector 
mis-measurement of top events where both $W$'s decay hadronically. The second, which 
dominates at large $\slashed{E}_T$, is due to one or both of the $W$'s decaying into a $\tau$ which in turn 
decays hadronically. In this case the $\nu_\tau$ present in the top decay provides an intrinsic source 
of SM $\slashed{E}_T$. Other sources of intrinsic $\slashed{E}_T$ in hadronic SM events include neutrinos from 
heavy flavor decays, and events where one or both of the $W$'s decay leptonically and all 
charged leptons in the decay are lost, either due to acceptances or detector effects. For the 
QCD background the dominant source of $\slashed{E}_T$ is mis-measurement of the jets. 



For semi-leptonic top decays (detailed in Section IV), the sources of $\slashed{E}_T$ are similar, 
although as we now require one charged lepton there will be more neutrinos (either from 
leptonic decays of the $W$ or leptonic decays of a $\tau$). The $W$+jets process is a relevant (but 
subdominant) background for the semi-leptonic analysis, and here the $\slashed{E}_T$ comes from the 
leptonic decay of the $W$. The main QCD contribution is from jets faking leptons, but the 
rate for this is low. In the background events with a leptonically decaying $W$, the transverse 
mass of the lepton and the $\slashed{E}_T$ should lie below the $W$ mass; there is however a tail above 
the $W$ mass generated by events with a leptonic $W$ and a hadronic $\tau$. As we will show in the 
next section, this arrangement of background $\slashed{E}_T$ allows for a significant increase in signal 
over background by combining $\slashed{E}_T$ with other kinematic information into a transverse mass 
variable $M_T^W$. 



$\slashed{E}_T$ and $M_T^W$ distributions in signal 

By looking in the hadronic channel with a lepton veto, the separation between events with 
intrinsic $\slashed{E}_T$ (signal), and those with other sources of $\slashed{E}_T$ (background), can be maximized. 
One might expect that the stop signal missing transverse energy would also be very small, 
especially when the masses of the LSP and stop are such that $\Delta = m_{\tilde t} - (m_t + m_\chi) \approx 0$, 
making separation difficult. However, it is important to remember that the name 'missing 
transverse energy' is a misnomer. It is not the transverse energy that is measured; rather, the 
detectors record transverse momentum. In the rest frame of the parent stop, the momentum 
of the LSP is 

$$Q = \frac{\sqrt{[m_{\tilde t}^2 - (m_t + m_\chi)^2]\,[m_{\tilde t}^2 - (m_t - m_\chi)^2]}}{2 m_{\tilde t}}. \qquad (1)$$

For small splitting the missing momentum scales as $Q \approx \sqrt{2\mu\Delta}$ if $\Delta \ll m_\chi$ and 
$Q \approx \sqrt{\Delta(2m_\chi + \Delta)}$ if $\Delta \sim m_\chi \ll m_t$ (here $\mu$ is the reduced mass of the 
neutralino-top system). In all but the last case the scale of the missing momentum is enhanced 
above that of the small mass splitting, proportional only to the square root of the small mass scale. 

Even in the limit where the stop is completely degenerate with the top-neutralino system 
($\Delta = 0$), the decay will proceed through the stop (or top, though this possibility was 
neglected in the Monte Carlo methods used in this paper) being off-shell by an amount 
comparable to the width $\Gamma$. In this limit, where we assume the decay is still prompt, $\Delta$ 
should be replaced with $\Gamma$ in the above expressions. Thus, for stops produced > 5 GeV off 
shell and $m_\chi$ > 50 GeV, we expect the LSPs to carry $\gtrsim 20$ GeV of momentum each, in the 
rest frame of the stop. 
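
As a rough numerical illustration of Eq. (1), the following Python sketch evaluates the LSP momentum in the stop rest frame and compares it with the small-splitting approximation $\sqrt{2\mu\Delta}$. The function names and the benchmark point ($m_t = 175$ GeV, $m_\chi = 50$ GeV, $\Delta = 5$ GeV) are chosen here for illustration only; they are not part of the simulation chain used in this paper.

```python
import math


def lsp_momentum(m_stop, m_top, m_lsp):
    """LSP momentum Q (GeV) in the stop rest frame for an on-shell two-body decay, Eq. (1)."""
    if m_stop < m_top + m_lsp:
        raise ValueError("two-body decay is kinematically closed for on-shell states")
    plus = m_stop**2 - (m_top + m_lsp) ** 2
    minus = m_stop**2 - (m_top - m_lsp) ** 2
    return math.sqrt(plus * minus) / (2.0 * m_stop)


def q_small_splitting(delta, m_top, m_lsp):
    """Approximation Q ~ sqrt(2 * mu * Delta), expected to hold for Delta << m_lsp."""
    mu = m_top * m_lsp / (m_top + m_lsp)  # reduced mass of the top-neutralino system
    return math.sqrt(2.0 * mu * delta)


# Illustrative benchmark (assumed values, in GeV).
m_top, m_lsp, delta = 175.0, 50.0, 5.0
q_exact = lsp_momentum(m_top + m_lsp + delta, m_top, m_lsp)
q_approx = q_small_splitting(delta, m_top, m_lsp)
print(f"exact Q = {q_exact:.1f} GeV, approximate Q = {q_approx:.1f} GeV")
```

For this benchmark both expressions give a momentum of about 20 GeV, several times larger than the 5 GeV splitting itself, which is the square-root enhancement described above.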

The intrinsic $\slashed{E}_T$ of the event is obtained from the vector sum of the LSP transverse 
momenta in the lab frame. Each stop is not generically at rest in the lab frame, and is 
boosted with respect to the center-of-mass frame of the partonic collision. The presence of 
ISR activity also provides a transverse boost, and causes the tops and neutralinos resulting 
from the stop decays to not be back-to-back, increasing the $\slashed{E}_T$. Taking all these effects 
into account, we expect a harder distribution of $\slashed{E}_T$ in stop pair events than in $t\bar t$ events, 
even for degenerate stops. This is confirmed by explicit simulation, as we will show. Note 
that our detailed simulations with MadGraph and Pythia use the matrix element for stop 
pair production plus an extra jet to more accurately model the effect on $\slashed{E}_T$ of the stop pair 
recoiling against an extra energetic jet. 

For the case of semi-leptonic top decays, the background, as outlined above, also contains 
irreducible sources of $\slashed{E}_T$. However, in these cases there is a $\slashed{E}_T$-related variable that 
distinguishes signal and background: the transverse mass $M_T^W$ defined below. Though the visible 
decay products are identical in signal and background, we can try to distinguish the two 
by considering the difference between the invisible components. For signal, the $\slashed{E}_T$ consists 
of two LSPs and a neutrino, while for background, it comes predominantly from a single 
neutrino, which partners with the visible lepton to form a $W$ boson. If we assume that all 
events come from SM $t\bar t$ events, and thus that the neutrino $p_T$ is equal to the observed $\slashed{E}_T$, 
then we can attempt to reconstruct the $z$-component (up to a two-fold ambiguity) of the 
neutrino momentum, using the $W$-mass as a constraint: 

$$p_z^\nu = \frac{p_z^\ell \left(M_W^2 + 2\,\vec p_T^{\,\ell} \cdot \vec{\slashed{E}}_T\right) \pm E^\ell \sqrt{\left(M_W^2 + 2\,\vec p_T^{\,\ell} \cdot \vec{\slashed{E}}_T\right)^2 - 4\,(p_T^\ell)^2\, \slashed{E}_T^2}}{2\,(p_T^\ell)^2}. \qquad (2)$$

Clearly, if the missing energy is either inaccurately measured or not due to a $W$-induced 
neutrino, then this reconstruction will fail. One indication of such a failure is 
that the quantity under the square root becomes negative. Defining 

$$(M_T^W)^2 = 2\left(p_T^\ell\, \slashed{E}_T - \vec p_T^{\,\ell} \cdot \vec{\slashed{E}}_T\right), \qquad (3)$$

we can improve the signal over background ratio by restricting ourselves to the region $M_T^W > M_W$. 
This improvement arises because only mis-measurement and hadronic taus can drive 
$M_T^W$ into this regime for background, while for signal, the vector sum of two neutralinos 
and the neutrino can easily result in $\slashed{E}_T$ that satisfies this constraint, even without 
mis-measurement. 
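
The connection between Eqs. (2) and (3) can be checked numerically: for a massless lepton, the quantity under the square root in Eq. (2) is negative exactly when $M_T^W > M_W$. The sketch below is an illustration only. It assumes a massless lepton and a $W$ mass of 80.4 GeV, and the function names and the two example kinematic configurations are hypothetical rather than taken from the analysis code.

```python
import math

M_W = 80.4  # GeV; assumed value of the W mass for this illustration


def transverse_mass_sq(lep_px, lep_py, met_x, met_y):
    """(M_T^W)^2 = 2 (pT_lep * MET - pT_lep . MET), as in Eq. (3)."""
    pt_lep = math.hypot(lep_px, lep_py)
    met = math.hypot(met_x, met_y)
    return 2.0 * (pt_lep * met - (lep_px * met_x + lep_py * met_y))


def neutrino_pz(lep_px, lep_py, lep_pz, met_x, met_y, m_w=M_W):
    """Both solutions of Eq. (2), or None when the square root is imaginary.

    The lepton is treated as massless and the neutrino pT is identified
    with the measured missing transverse momentum.
    """
    pt_lep_sq = lep_px**2 + lep_py**2
    e_lep = math.sqrt(pt_lep_sq + lep_pz**2)
    met_sq = met_x**2 + met_y**2
    a = m_w**2 + 2.0 * (lep_px * met_x + lep_py * met_y)
    disc = a**2 - 4.0 * pt_lep_sq * met_sq
    if disc < 0.0:
        return None  # reconstruction fails: equivalent to M_T^W > M_W
    root = e_lep * math.sqrt(disc)
    return ((lep_pz * a + root) / (2.0 * pt_lep_sq),
            (lep_pz * a - root) / (2.0 * pt_lep_sq))


# Background-like configuration: modest MET roughly opposite to the lepton.
print(neutrino_pz(40.0, 0.0, 10.0, -35.0, 5.0))
# Signal-like configuration: large MET, so M_T^W exceeds M_W and no solution exists.
print(neutrino_pz(40.0, 0.0, 10.0, -150.0, 20.0))
```

In an event loop, the `None` return value (or equivalently a cut on `transverse_mass_sq`) would select the region above the Jacobian peak where the signal is enhanced.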



Shape analyses 

Experimental analyses, particularly at hadron colliders, have tended to shy away from 
modeling the shape of MET distributions. In final states dominated by jets, there is the 
complicated phenomenon of jet mis-measurement, or more generally the nonlinear response 
of the calorimetry used for the standard calorimeter-based reconstruction of MET. However, 
the ATLAS and CMS experiments have already demonstrated the ability to understand 
MET distributions in a variety of complex final states, and to simulate MET including the 
contributions to MET from imperfect detector response and reconstruction [42, 43]. Already 
in the 2010 LHC run, the Razor analysis at CMS demonstrated the usefulness of modeling 
MET-based observables for inclusive SUSY searches [44], and a similar approach was applied 
in the 2011 run to a Razor search for relatively light third-generation leptoquarks [38]. The 
latter is especially relevant to the search for light stops, since it involved $b$-tagging and 
was optimized for lighter particles producing weaker MET signals. These successful shape 
analyses in jet-dominated final states in LHC data validate that the basic approach pursued 
in this report can, with suitable modifications, be mapped into successful searches. 

For semi-leptonic final states there is an even stronger track record of successful modeling 
of MET-based observables. In particular, the spectacularly precise determinations of the $W$ 
boson mass by the CDF and D0 experiments were based on modeling of the $M_T^W$ distribution 
in large lepton-triggered data sets [45]. For stop searches we require much less precision in 
the determination of the shape, and we are interested exclusively in modeling the tail above 
the Jacobian peak, rather than the peak itself. 

For our study we have relied on simulated samples, with events generated by 
MadGraph5 [46], showered and hadronized by Pythia6 [47], and physics objects reconstructed 
using PGS [48]. The use of MadGraph allows us to simulate both SUSY signals and the $t\bar t$ 
background with extra partonic jets included in the matrix element. This adds essential 
realism both in that initial state radiation (ISR) effects are important when simulating 
degenerate stops, and because our baseline selection relies on counting jets. PGS has been 
shown to give reasonably accurate results for MET and other basic observables for the case 
of SUSY signals [49, 50] and, by extension, $t\bar t$, as long as one does not probe too far out in 
the tails of distributions. 

Accurate simulation of QCD multijet backgrounds and the MET associated with them is 
a more serious challenge, both because of the difficulty of generating samples with sufficient 
Monte Carlo statistics, and because of the difficulty of trusting features of such samples in a 
toy detector simulation after making very hard cuts. For our analysis we generated the equivalent 
of approximately 2 fb$^{-1}$ of QCD multijets. We used the loosest of several different baseline 
selections (all requiring $b$-jets tagged to varying degrees of strictness) that seemed to give roughly 
comparable sensitivity, with the idea that this makes our background modeling more reliable. 
While we have some confidence that our results agree at least qualitatively with 
distributions obtained from LHC data, our simulated background samples should be considered as 
placeholders for data control samples in a real LHC analysis. 

For this study we simulated the three largest backgrounds: QCD multijets, $t\bar t$+jets, and 
$W$+jets, but neglect the smaller contributions from $Z/\gamma^*$+jets, dibosons, single top, and 
$t\bar t$+$Z$. 

Figure 2 (left) shows the background MET distributions that we obtain after our hadronic 
baseline selection (detailed in Section III). QCD multijets dominates for MET values below 
about 150 GeV, while $t\bar t$ dominates above. $W$+jets and other backgrounds were found to 
have a negligible effect on the MET shape above about 40 GeV. Above 40 GeV, both the 
QCD and $t\bar t$ backgrounds have MET distributions with a simple shape. Both shapes are 
well-described by the sum of two exponentials, a feature reminiscent of the kinematic shapes in 
the Razor analyses. The results of a fit (from RooFit [51]) in the MET range between 40 and 
400 GeV are shown in red in the figure. The MET distributions of the hadronic signals from 
light stops also have simple shapes, as illustrated in Figure 2 (right). As expected, while the 
signals suffer from lower cross sections compared to background, for MET exceeding ~100 
GeV they start to emerge as significant distortions of the MET shape. For degenerate stops 
the signal MET shapes have an exponential drop-off that is similar, but not identical, to 
that of $t\bar t$. 

FIG. 2: Left: Differential distribution of events for 20 fb$^{-1}$ with respect to $\slashed{E}_T$ of QCD (blue) 
and $t\bar t$ (green), and the total background (black) passing the hadronic trigger. The analytic fits to 
Eq. (4) using the parameters in Table I are shown in red for QCD (dashed), $t\bar t$ (dotted) and their 
sum (solid). Right: Differential distribution of events corresponding to 20 fb$^{-1}$ with respect to $\slashed{E}_T$ 
for signal $\tilde t\tilde t^* \to t\bar t\chi\chi$ passing the hadronic trigger for a range of stop and LSP masses ($m_{\tilde t}$, $m_\chi$). 

One could employ a more traditional "cut and count" approach to the light stop analysis, 
but it is clear from Figure 2 that such an analysis would be complicated by the variety of 
different signal shapes and signal MET regions of interest. However, it can serve as a useful 
cross-check, and so (as we will show), we have performed a simple cut-and-count analysis 
for comparison to our shape analysis. An intermediate approach is to replace our analytic 
fits to the background shapes with a coarsely-binned analysis of MET yields; however, given 
the simplicity of the background shapes it is not surprising that such a "poor-man's" shape 
analysis has less sensitivity when compared to the full shape analysis. 

FIG. 3: Differential distribution of $t\bar t$ events with respect to $M_T^W$ (black). The analytic fit (Eq. Q 
using the parameters of Table II) is shown in red. Also shown are the differential distributions 
of stop signal events with respect to $M_T^W$ for a range of stop and LSP masses. The semi-leptonic 
event selection is described in Section IV. 

For the semi-leptonic analysis, the variable of interest is $M_T^W$ rather than $\slashed{E}_T$; specifically, 
we are interested in the shape of $M_T^W$ above the mass of the $W$, where background is 
reduced. Using a lepton trigger followed by a tight $b$-tag can significantly reduce the $W$+jets 
background in this range, leaving only $t\bar t$ as the dominant background (the full baseline 
selection is described in Section IV). Using the same event generation as in the hadronic 
case, we show in Figure 3 the distribution of $t\bar t$ background with respect to $M_T^W$. Above 
$M_W$, this distribution can, like $\slashed{E}_T$, be fit with a pair of exponentials, greatly simplifying 
the shape analysis. Signal distributions for a representative sample of stop and LSP masses 
are also shown; as in the fully hadronic case, the shapes are sufficiently different to allow 
discrimination. 

III. LHC SEARCH FOR HADRONIC STOPS 

In order to look for the effects of stops in the shape of $\slashed{E}_T$ in hadronic events, we must 
first significantly reduce QCD background. We do this by applying a baseline selection 
based on an all-hadronic trigger, simplifying those developed by ATLAS and CMS for LHC 
running. We require at least two jets with $p_T > 80$ GeV and at least two additional jets with 
$p_T > 50$ GeV, with a requirement of $|\eta| < 3$ for all jets. Of the jets with $p_T$ above 50 GeV, 
two must be tagged as $b$-jets; at least one must pass a "tight" $b$-tagging requirement, and 
the second must pass at least the "loose" requirement. Events that contain any electrons 
with $p_T > 20$ GeV, $|\eta| < 2.5$ or any muons with $p_T > 20$ GeV, $|\eta| < 2.1$ are vetoed (for our 
simulations, taus are treated as jets, thus forming an irreducible background that contains 
large $\slashed{E}_T$). 
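
The baseline selection just described can be summarized as a simple event filter. The sketch below is a minimal illustration, not the code used for this study. The event representation (lists of dictionaries with `pt`, `eta` and `btag` keys) is hypothetical, and it assumes that each jet's `btag` entry records the tightest working point it passes, so that a "tight" jet also counts as "loose".

```python
def passes_hadronic_baseline(jets, electrons, muons):
    """Apply the hadronic baseline selection to one event.

    jets: list of dicts with keys 'pt' (GeV), 'eta', 'btag' ('tight', 'loose' or None).
    electrons, muons: lists of dicts with keys 'pt' (GeV) and 'eta'.
    """
    central = [j for j in jets if abs(j["eta"]) < 3.0]
    n_above_80 = sum(1 for j in central if j["pt"] > 80.0)
    n_above_50 = sum(1 for j in central if j["pt"] > 50.0)
    # Two jets above 80 GeV plus two additional jets above 50 GeV.
    if n_above_80 < 2 or n_above_50 < 4:
        return False

    # Two b-tagged jets above 50 GeV, at least one of them tight.
    tags = [j["btag"] for j in central
            if j["pt"] > 50.0 and j["btag"] in ("tight", "loose")]
    if len(tags) < 2 or "tight" not in tags:
        return False

    # Lepton veto.
    if any(e["pt"] > 20.0 and abs(e["eta"]) < 2.5 for e in electrons):
        return False
    if any(m["pt"] > 20.0 and abs(m["eta"]) < 2.1 for m in muons):
        return False
    return True


# Hypothetical example event with four central jets and no leptons.
example_jets = [
    {"pt": 120.0, "eta": 0.5, "btag": "tight"},
    {"pt": 95.0, "eta": -1.2, "btag": "loose"},
    {"pt": 70.0, "eta": 2.0, "btag": None},
    {"pt": 55.0, "eta": -0.3, "btag": None},
]
print(passes_hadronic_baseline(example_jets, [], []))
```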

To calculate the efficiencies with which tops, QCD, and signal stop events pass the trigger, 
we perform Monte Carlo simulation of the CMS detector, using MadGraph5/MadEvent to 
generate $t\bar t$ backgrounds and $\tilde t\tilde t^*$ signal events, matched to one additional jet. The $\tilde t \to t\chi$ 
branching ratio is set to 1, and top decay is handled by Pythia6. The top mass is assumed 
to be 175 GeV. Detector simulation is done by PGS4, modified to more closely match the 
reported CMS $b$-tagging efficiencies for both "tight" and "loose" thresholds, as found in 
Ref. [52]. The top cross section at $\sqrt{s} = 8$ TeV was calculated to be 226.9 pb using MCFM 
at NLO, while the stop cross sections were determined using Prospino [41], and are shown 
in Figure 1. 

Using these simulations we find that the trigger has a ~7% pass efficiency for background 
tops, while the stop signal efficiency can vary from 2% to 20%, depending on the mass 
splitting between the stops and the LSP (see Figure 4). Larger splittings lead to more 
energetic tops in the decay, and so result in more high-$p_T$ jets and a higher trigger efficiency. 
We generated four-jet QCD background events in MadGraph, and allowed them to hadronize 
and shower through Pythia6, which produced a higher multiplicity of jets. The QCD 
total cross section and differential rates at LHC7 were compared to ATLAS experimental 
results [53], and scaled to LHC8 by taking the ratio of Alpgen [54] partonic cross sections at 
LHC7 and LHC8. After application of our jet trigger selection and $b$-tag requirements, we 
find that ~265 pb of QCD background remains. However, only 17% of these events have 
$\slashed{E}_T$ above 40 GeV. 
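
To give a sense of scale, the cross sections and efficiencies quoted above can be converted into approximate event counts for 20 fb$^{-1}$. The short Python sketch below does only this arithmetic. The helper name is hypothetical, and the inputs are the rounded numbers from the text (the ~7% efficiency in particular is approximate), so the outputs should be read as order-of-magnitude estimates.

```python
PB_TO_FB = 1000.0  # 1 pb = 1000 fb


def expected_events(cross_section_pb, efficiency, lumi_fb_inv):
    """Expected number of events: sigma * efficiency * integrated luminosity."""
    return cross_section_pb * PB_TO_FB * efficiency * lumi_fb_inv


lumi = 20.0  # fb^-1

# ttbar: 226.9 pb NLO cross section with a ~7% baseline pass efficiency.
ttbar_after_selection = expected_events(226.9, 0.07, lumi)

# QCD: ~265 pb remains after the jet trigger and b-tag requirements,
# of which 17% has missing transverse energy above 40 GeV.
qcd_after_selection = expected_events(265.0, 1.0, lumi)
qcd_met_above_40 = 0.17 * qcd_after_selection

print(f"ttbar after selection: {ttbar_after_selection:.3g}")
print(f"QCD after selection:   {qcd_after_selection:.3g}")
print(f"QCD with MET > 40 GeV: {qcd_met_above_40:.3g}")
```

With these inputs the $t\bar t$ sample after selection contains roughly $3 \times 10^5$ events and the QCD sample with $\slashed{E}_T > 40$ GeV roughly $9 \times 10^5$ events.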
FIG. 4: Left: Signal trigger efficiency as a function of stop and LSP masses for hadronic event 
selection. Right: Signal cross section times trigger efficiencies as a function of stop and LSP masses. 
Like all such plots in this paper, the contours are extrapolated from a grid of Monte Carlo results 
with 5–25 GeV spacing in $m_{\tilde t}$ and $m_\chi$. The degeneracy line ($m_{\tilde t} = m_t + m_\chi$) is shown in black. 

As stated previously, assuming perfect detectors and no contamination from events with 
leptons (and thus neutrinos), the top and QCD backgrounds should have zero $\slashed{E}_T$. However, 
this is clearly not an assumption that survives contact with reality. Mismeasurement of jets, 
mis-tags of electrons and taus, and other experimental effects will all contribute non-zero 
$\slashed{E}_T$ to the background. As we are limited to publicly available tools in our simulations, we 
cannot hope to exactly reproduce the $\slashed{E}_T$ distribution in top events which will be observed 
by CMS and ATLAS. However, our PGS simulation of the detector (using the CMS detector 
geometry) will be sufficient to demonstrate the general behavior. 

In the left panel of Figure 2 we plot the $\slashed{E}_T$ distribution for background events (in 
5 GeV bins) passing our trigger selection criteria, using an initial set of 60 million QCD 
MadGraph/Pythia/PGS events and 27 million top events. Two important features can be 
easily noticed. First, the $\slashed{E}_T$ background peaks at ~20 GeV; this is at or below the intrinsic 
$\slashed{E}_T$ value of stop events for all mass parameters of interest. Second, past the peak, each 
background is exponentially falling. We separately fit each background to a sum of two 
exponentials, 



$$\frac{d\sigma}{d\slashed{E}_T} = A e^{-\alpha \slashed{E}_T} + B e^{-\beta \slashed{E}_T} . \qquad (4)$$



Due to limited statistics in the tail, and the complicated structure at low $\slashed{E}_T$, we only use 
this analytic fit over the range 40 < $\slashed{E}_T$ < 400 GeV. Other choices for the fitting function 
are possible (such as a Gaussian or Cruijff function, combined with exponentials), and may 
increase the range over which the background may be modeled. However, this simple choice 
suffices for our purposes. The corresponding distributions for signal are shown in the right- 
hand panel of Figure 2 for a range of stop and LSP masses. For each signal point, we generate 
between 400,000 and one million matched stop pair events using MadGraph/Pythia/PGS. 
However, we do not attempt an analytic fit for signal. Notice that, for signal, the total $\slashed{E}_T$ 
peaks at a higher value than the parton-level $\slashed{E}_T$ does. This is because jet 
mismeasurement adds to the $\slashed{E}_T$ carried by the LSP momenta, which increases the average 
$\slashed{E}_T$ observed. 

            QCD                                               $t\bar t$
$\alpha$    $6.9 \times 10^{-2} \pm 1.56 \times 10^{-3}$      $6.29 \times 10^{-2} \pm 1.63 \times 10^{-3}$
$\beta$     $3.77 \times 10^{-2} \pm 1.26 \times 10^{-3}$     $1.89 \times 10^{-2} \pm 1.57 \times 10^{-4}$

TABLE I: Best fit parameters for QCD and $t\bar t$ $\slashed{E}_T$ distributions, fit to Eq. 4 for an integrated 
luminosity of 20 fb$^{-1}$. Note that these errors are correlated with each other and with the 
normalizations (A, B), which in turn depend on the amount of integrated luminosity considered. See text 
for details. 
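As a minimal sketch of the fit function in Eq. 4, the snippet below evaluates the sum of two exponentials and integrates it analytically over 5 GeV bins between 40 and 400 GeV. The function names are hypothetical, and the normalizations and slopes are illustrative placeholders of the same order as the fitted slopes, not the fitted values themselves.

```python
import math


def double_exp(x, A, alpha, B, beta):
    """Eq. 4: differential rate A*exp(-alpha*x) + B*exp(-beta*x)."""
    return A * math.exp(-alpha * x) + B * math.exp(-beta * x)


def bin_integral(lo, hi, A, alpha, B, beta):
    """Analytic integral of the two-exponential model between lo and hi."""
    def one(norm, slope):
        return norm / slope * (math.exp(-slope * lo) - math.exp(-slope * hi))
    return one(A, alpha) + one(B, beta)


def expected_counts(edges, A, alpha, B, beta):
    """Expected yield in each bin defined by consecutive edges."""
    return [bin_integral(lo, hi, A, alpha, B, beta)
            for lo, hi in zip(edges[:-1], edges[1:])]


# Illustrative (hypothetical) parameters; x is in GeV, slopes in 1/GeV.
edges = list(range(40, 405, 5))          # 40, 45, ..., 400 GeV
A, alpha, B, beta = 1.0e5, 0.065, 1.0e3, 0.02
counts = expected_counts(edges, A, alpha, B, beta)
```

Integrating over each bin, rather than evaluating the function at the bin centre, keeps the predicted yields exact even where the spectrum falls steeply across a bin.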

Our analysis is based entirely on Monte Carlo (MC) simulated samples. To mimic the 
statistical fluctuations one would expect to see in data, which 
affect the precision of the fits, we carry out the fits outlined above on appropriately 
chosen samples of MC data. For $t\bar t$ we can generate in MC the number of events expected 
after 20 fb$^{-1}$ of 8 TeV running and use this to extract the parameters. For QCD we cannot 
hope to generate sufficient MC, so instead we carry out the fit on the 60 million QCD events 
that we have. We then use this fit as an input to generate "pseudo-data" appropriate to 20 
fb$^{-1}$, and refit to the pseudo-data. This approach captures the uncertainty expected in the 
fit of real data. We show the best-fit slopes, and the associated errors, in Table I. However, 
note that there are sizable correlations between these fit parameters that need to be taken 
into account when calculating the uncertainty on the fit. 
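The pseudo-data step can be sketched as follows, under stated assumptions: the total number of events is drawn from a Poisson distribution around the yield predicted by the analytic fit, and each event is assigned a value by inverse-CDF sampling of the two-exponential shape restricted to the fit range. All names and parameter values are hypothetical, and this is an illustration of the idea rather than the procedure actually used for the fits.

```python
import math
import random


def component_yield(norm, slope, lo, hi):
    """Integral of norm*exp(-slope*x) over [lo, hi]."""
    return norm / slope * (math.exp(-slope * lo) - math.exp(-slope * hi))


def poisson(mu, rng):
    """Exact Poisson draw: count unit-rate exponential arrivals before time mu."""
    n, t = 0, rng.expovariate(1.0)
    while t < mu:
        n += 1
        t += rng.expovariate(1.0)
    return n


def sample_truncated_exp(slope, lo, hi, rng):
    """Inverse-CDF sample of exp(-slope*x) restricted to [lo, hi)."""
    a, b = math.exp(-slope * lo), math.exp(-slope * hi)
    return -math.log(a - rng.random() * (a - b)) / slope


def generate_pseudo_data(A, alpha, B, beta, lo, hi, bin_width, rng):
    """Histogram of one pseudo-experiment drawn from the two-exponential model."""
    y1 = component_yield(A, alpha, lo, hi)
    y2 = component_yield(B, beta, lo, hi)
    n_bins = int(round((hi - lo) / bin_width))
    hist = [0] * n_bins
    for _ in range(poisson(y1 + y2, rng)):
        # Pick a component with probability proportional to its yield.
        slope = alpha if rng.random() * (y1 + y2) < y1 else beta
        x = sample_truncated_exp(slope, lo, hi, rng)
        hist[min(int((x - lo) / bin_width), n_bins - 1)] += 1
    return hist


# Hypothetical normalizations chosen to give a few thousand events.
rng = random.Random(1)
hist = generate_pseudo_data(2000.0, 0.065, 50.0, 0.02, 40.0, 400.0, 5.0, rng)
```

Refitting many such histograms gives the spread of fitted parameters that one would expect from statistical fluctuations alone.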

Although we are handicapped by having to rely on MC to determine the shape of the 
background distributions, the LHC collaborations do not suffer from this restriction, as they 
are in possession of copious amounts of data. The QCD background to our signal contains 
two b-tagged jets, mostly from a light quark or charm quark faking a b-quark, or from real b 
production. The complementary anti-b-tagged sample (4 jets above our cuts but with no 
b-tags) provides a clean sample of (predominantly) QCD events in which to measure the 
$\slashed{E}_T$ distribution. However, in order to extrapolate the $\slashed{E}_T$ distribution from this sample 
to the signal region, the b mis-tag rate in QCD samples, as a function of jet $p_T$, must 
be well understood. Through simulation we estimate that if this mis-tag rate is known to 
~ 20% accuracy, as a function of $p_T$, then the effects on the determination of the parameters 
describing the QCD background are within our present uncertainties. This is encouraging 
for a data-based analysis. The $t\bar t$ background is harder to determine from data alone, but 
this issue is beyond the scope of our discussion. 



Maximum likelihood method 

In order to estimate the potential for 20 fb$^{-1}$ of LHC8 data to exclude or observe the stop 
simplified model at a particular parameter point ($m_{\tilde t}$, $m_\chi$), we must have some measure of the 
difference between signal and background $\slashed{E}_T$ curves. The measure we employ is hypothesis 
testing with profiled likelihoods [55]. In this approach one calculates likelihoods assuming 
the observed data is the result of a particular hypothesis, maximizing the likelihoods over 
"nuisance" parameters, which in our case are the 8 parameters of the fits to the background 
$\slashed{E}_T$ shapes. We account for the known correlated uncertainties in the fit parameters by 
introducing Gaussian penalty terms into the definition of the likelihoods. 

Since the above procedure requires access to data, we instead ask how well 
the experiments can expect to do if the data they observe is due to a particular model. There 
are two natural hypotheses for what the LHC may see: a) there is no light 
stop and the only production mechanisms are from the SM, or b) there is a light stop and 
the production cross section is as predicted in the MSSM.$^1$ To calculate the likelihoods for 
these two hypotheses, we can take advantage of our background analytic function as well as 
the shape of the distribution of signal events, determined from MC, to generate pseudo-data 
which contains an amount of statistical fluctuation equivalent to 20 fb$^{-1}$ of actual 
data. We generate this pseudo-data using the central values of the best fit parameters found 
in Table I. We then attempt to fit this pseudo-data to both the SM only hypothesis and 
the SM+stop hypothesis. 

$^1$ There is clearly a continuum of possibilities in which there is a light stop and neutralino but the production 
cross section differs from what is predicted in the MSSM. Carrying out a full scan in the stop 
production cross section is beyond the scope of this paper. 

The log likelihood, including the constraint associated with the Gaussian uncertainties 
on the background fit parameters, $c_i$, is given by 

$$\log L(c_i, \sigma) = \sum_{\rm bins} \left[ -\nu(c_i, \sigma) + n + n \log\frac{\nu(c_i, \sigma)}{n} \right] - \frac{1}{2} \sum_{pq} (c_p - \hat c_p)\, C^{-1}_{pq}\, (c_q - \hat c_q) , \qquad (5)$$

where $\nu$ is the predicted number of events in a bin, $n$ is the observed number of events in a 
bin for a particular set of pseudo-data, $\hat c_i$ is the central value of the $i$th fit parameter, and 
$C$ is the covariance matrix of those fit parameters. The second summation term in Eq. 5 is a 
constraint in the maximization, coming from assuming the uncertainties in the parameters 
of the background fit are Gaussian in nature. We allow the eight parameters involved in the 
background fits (four normalizations and four slopes) to vary within their uncertainties and 
maximize the log likelihood over these parameters and the signal production cross section, $\sigma$. 
That is, for the SM only hypothesis, we maximize $\log L$ over $c_i$ and $\sigma$, and for the SM+stop 
hypothesis, we fix $\sigma$ to the NLO expectation, $\sigma^*$, and maximize $\log L$ over $c_i$. Since the 
pseudo-data was generated under the SM only hypothesis, the best-fit $\sigma \approx 0$ in all cases. 
As our test statistic we use twice the difference between these two values, 

$$2\Delta\log L = 2\log L(c_i, \sigma) - 2\log L(c_i, \sigma^*) , \qquad (6)$$

which for clarity we convert into a number of standard deviations $n_\sigma = \sqrt{2\Delta\log L}$. This 
$n_\sigma$ measures the incompatibility of the SM+stop versus SM only profiled likelihoods. We 
repeat this process 200 times to obtain the average sensitivity. 
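The penalized log likelihood and the conversion to a number of standard deviations can be sketched as below. This is a minimal illustration only: it evaluates the likelihood at fixed parameter values and leaves out the maximization over the nuisance parameters, which would need a numerical optimizer. The function names and the toy bin contents are hypothetical.

```python
import math


def log_likelihood(observed, predicted, c, c_hat, cov_inv):
    """Binned Poisson log likelihood (normalized so a perfect fit gives 0)
    minus a Gaussian penalty on the fit parameters c around c_hat."""
    poisson_term = 0.0
    for n, nu in zip(observed, predicted):
        poisson_term += -nu + n
        if n > 0:
            poisson_term += n * math.log(nu / n)
    d = [ci - chi for ci, chi in zip(c, c_hat)]
    penalty = 0.5 * sum(d[p] * cov_inv[p][q] * d[q]
                        for p in range(len(d)) for q in range(len(d)))
    return poisson_term - penalty


def n_sigma(logL_sm, logL_stop):
    """Convert the test statistic 2*(logL_sm - logL_stop) to standard deviations."""
    return math.sqrt(max(0.0, 2.0 * (logL_sm - logL_stop)))


# Toy example: pseudo-data that matches the SM-only prediction exactly.
observed = [100, 50, 10]
pred_sm = [100.0, 50.0, 10.0]
pred_stop = [110.0, 58.0, 14.0]          # hypothetical SM + stop prediction
c_hat = [0.065]
cov_inv = [[1.0 / 0.002 ** 2]]           # inverse of a 1x1 covariance matrix
logL_sm = log_likelihood(observed, pred_sm, c_hat, c_hat, cov_inv)
logL_stop = log_likelihood(observed, pred_stop, c_hat, c_hat, cov_inv)
significance = n_sigma(logL_sm, logL_stop)
```

In a full analysis each of the two log likelihoods would be replaced by its maximum over the background parameters before taking the difference.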

FIG. 5: Expected sensitivity, in standard deviations, for the hadronic $\slashed{E}_T$ shape analysis as a 
function of the stop and LSP masses. The test statistic is computed from 200 pseudo-experiments 
of 20 fb$^{-1}$. In the left-hand plot the uncertainties on the background $\slashed{E}_T$ shape are as shown in 
Table I, and in the right-hand plot these errors have been inflated by a factor of 3. 

In addition to the profile likelihood method described above, we also investigate the 
sensitivity along the "degeneracy line" ($m_{\tilde t} - m_\chi = m_t$) using the CLs method [56], [57]. We do so 
by generating $10^4$ pseudo-experiments under both the background only and signal+background 
hypotheses, and then use these pseudo-experiments to determine the expected exclusion of 
signal for an observation consistent with background. Since the CLs method requires a high 
statistics sample of pseudo-experiments, we did not calculate the bounds for stop masses 
below ~ 230 GeV. For the median expected exclusion we assume that the log likelihood ratio 
of the observed data falls at the median of the background-only distribution. For the one 
sigma CLs band we assume the data falls above/below the background median value by one 
sigma, and similarly for the two sigma band. 
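A minimal sketch of the CLs construction from pseudo-experiments is given below. It assumes a test statistic q for which larger values are more background-like, so that both confidence levels are upper-tail fractions; this sign convention, the function names, and the Gaussian toy distributions are assumptions made for illustration and are not taken from the analysis.

```python
import random
import statistics


def tail_fraction(samples, q_obs):
    """Fraction of pseudo-experiments with test statistic >= q_obs."""
    return sum(1 for q in samples if q >= q_obs) / len(samples)


def cls_from_toys(q_sb, q_b, q_obs):
    """Return (CL_sb, CL_b, CLs), where CLs = CL_sb / CL_b."""
    cl_sb = tail_fraction(q_sb, q_obs)
    cl_b = tail_fraction(q_b, q_obs)
    return cl_sb, cl_b, (cl_sb / cl_b if cl_b > 0 else float("inf"))


# Toy test-statistic distributions (hypothetical Gaussian shapes).
rng = random.Random(0)
q_b = [rng.gauss(5.0, 4.0) for _ in range(10000)]     # background only
q_sb = [rng.gauss(-5.0, 4.0) for _ in range(10000)]   # signal + background

# Median expected exclusion: take the observation at the background median.
q_obs = statistics.median(q_b)
cl_sb, cl_b, cls = cls_from_toys(q_sb, q_b, q_obs)
excluded_at_95 = cls < 0.05
```

The one and two sigma bands follow from repeating the last step with q_obs set to the corresponding quantiles of the background-only distribution instead of its median.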

Estimated Hadronic Stop Bounds 

Using these statistical methods, in Figure 5 we show the estimated significances extracted 
from our test statistic for light stop simplified models when the top decays hadronically. 

We estimate that for simplified models in which the stop/neutralino mass splitting is 
large, the LHC experiments can set strong stop mass limits up to ~ 550 GeV. In the case 
of a very light neutralino the reach is determined simply by the production cross section of 
the stops, which drops rapidly with the mass (Figure 1), although there is some softening 
of this behavior due to increased efficiency to pass the cuts as the stop mass is increased 
(Figure 4). 

Most interestingly, even along the mass-degeneracy line of $m_{\tilde t} - m_\chi = m_t$, stops of mass 
as high as ~ 350 GeV could be excluded with 20 fb$^{-1}$ of 8 TeV data. In fact we find that 
the sensitivity reach extends above the degeneracy line into regions where the stops decay 
into off-shell tops. 

FIG. 6: Left: $S/\sqrt{O}$ for the 100-200 GeV region of the signal plus background $\slashed{E}_T$ distribution. Right: 
$S/\sqrt{O}$ for the 200-400 GeV region. These are computed with an unrealistic assumption of no 
systematics. 

As an additional cross-check, we perform a simple cut-and-count analysis of the signal 
parameter points, dividing the $\slashed{E}_T$ range of 40-400 GeV into three regions: our "background" 
region of 40-100 GeV, and two signal regions of 100-200 GeV and 200-400 GeV. Iterating 
over 200 pseudo-experiments generating $\slashed{E}_T$ distributions of signal plus background events, 
we assume that all events in the background region are ascribable to the QCD and top 
backgrounds. This sets our overall normalization, which we use to predict (using our analytic 
fit, Eq. 4) the number of background events in our two signal regions. For each 
pseudo-experiment, we can then calculate the number of signal events $S$ in each signal region as the 
difference between the observed events $O$ and the predicted value $P$. In Figure 6 we plot 
the average value of $S/\sqrt{O}$ for both the low-$\slashed{E}_T$ and high-$\slashed{E}_T$ signal regions. Addition of a 
realistic systematic error to the predicted number of events will reduce the sensitivity of the 
cut-and-count method. For a stop mass of 250 GeV and LSP of 5 GeV one has, with zero 
systematics, $9\sigma$ sensitivity, while a 5% systematic reduces this to approximately $2\sigma$. 
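The effect of a systematic error on the cut-and-count significance can be illustrated with a short sketch. It assumes that a fractional systematic on the predicted background is added in quadrature to the statistical error, which is one common choice and not necessarily the treatment used here. The event counts are hypothetical, chosen only so that the zero-systematics significance is about 9; they are not taken from the simulation.

```python
import math


def cut_and_count(observed, predicted, syst_frac=0.0):
    """Return (S, significance) for one signal region.

    S = O - P. With syst_frac = 0 the significance is S / sqrt(O);
    otherwise a fractional systematic on P is added in quadrature
    (an assumption made for this illustration)."""
    s = observed - predicted
    return s, s / math.sqrt(observed + (syst_frac * predicted) ** 2)


# Hypothetical counts in a single signal region.
P, O = 7700, 8531
s, z_stat_only = cut_and_count(O, P)
_, z_with_syst = cut_and_count(O, P, syst_frac=0.05)
```

With these counts the signal is roughly 10% of the background, so a 5% uncertainty on the prediction is comparable to half the signal itself. This is why a shape analysis that constrains the background from data is advantageous.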

The shape analysis we are advocating allows for many of the backgrounds to be 
determined from control regions in data and thus removes many systematic uncertainties 
associated with theoretical predictions of background.$^2$ There are systematic uncertainties 
associated with extrapolation from control regions to signal regions, such as the b-tagging 
rates discussed above, but we estimate that these are subdominant to the fit uncertainties. 
We have, with the exception of QCD, fit to simulated data sets that are consistent with 
what one would expect after an integrated luminosity of 20 fb$^{-1}$ and find the errors as shown in Table I. 
However, since our analysis is entirely MC based, and it is possible that the real control 
regions will contain limited statistics, we also investigate how the sensitivity is affected if the 
errors in our fit parameters are inflated. In particular we consider the situation where the 
central values for the fit parameters are as shown in Table I but the errors are a factor of 3 
or 5 times larger. Assuming that the errors from extrapolation are then subdominant to the 
fit uncertainty, we keep the correlations between the fit parameters as we inflate the errors. 
With an inflation by 3 the fractional errors in the fit parameters range from a few percent to 17%, 
and inflation by 5 has a largest error of 30%, with the largest errors in the normalizations, 
as expected. The effects of this inflation, for $3\times$, are shown in Figure 5. An inflation by $5\times$ 
degrades the sensitivity as one moves towards degeneracy: along the degeneracy line the $2\sigma$ 
exclusion extends to 260 GeV. The $2\sigma$ exclusion for the case of a light neutralino is not greatly 
altered from the bound for $3\times$ inflation. 

$^2$ There still exists a difficult-to-quantify systematic error associated with the choice of functional form the 
background distributions are fit to. A discussion of these effects is beyond the scope of this paper. 

Focusing on the degeneracy line ($m_{\tilde t} - m_\chi = m_t$), a region of particular interest and 
considerable challenge, we apply the CLs method as outlined above. The median expected 
exclusion, as well as one and two sigma bands, on such a degenerate stop-neutralino pair is 
shown in Figure 7. Using the CLs method, stop masses up to 375 GeV can be excluded at 
$2\sigma$ when $m_{\tilde t} - m_\chi = m_t$. 

FIG. 7: The median expected exclusion, for background only pseudo-experiments, on a 
stop-neutralino simplified model whose masses are related by $m_{\tilde t} - m_\chi = m_t$. 

FIG. 8: Left: Trigger efficiency for semi-leptonic events as a function of stop and LSP 
masses. Right: Cross section times efficiency for the semi-leptonic selection criteria as a function 
of stop and LSP masses. 



IV. STUDY OF SHAPES 



We now turn from stops with fully hadronic decays of top to the semi-leptonic channel, 
discussed briefly in Section II. In this case the dominant background is $t\bar t$. Given that 
semi-leptonic $t\bar t$ decays have an intrinsic source of missing transverse energy from the neutrinos 
coming from the W decays, $\slashed{E}_T$ offers poorer discrimination between signal and background, 
as compared to the hadronic case. We therefore focus instead on the transverse mass variable 
$M_T^W$ defined in Equation 3. This variable is related to $\slashed{E}_T$, but has the additional feature 
that SM background $\slashed{E}_T$ from a single leptonic W decay is mostly distributed below the 
Jacobian peak near the W mass. 
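For orientation, the transverse mass can be computed as in the sketch below. Equation 3 is not reproduced in this part of the text, so the snippet assumes the standard definition for a massless lepton, $M_T^2 = 2\, p_T^\ell\, \slashed{E}_T\, (1 - \cos\Delta\phi)$, and the function name is hypothetical.

```python
import math


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Transverse mass of a lepton and the missing transverse momentum,
    using the standard massless-lepton form (an assumption here):
    M_T^2 = 2 * pT(lepton) * MET * (1 - cos(delta phi))."""
    dphi = lep_phi - met_phi
    return math.sqrt(max(0.0, 2.0 * lep_pt * met * (1.0 - math.cos(dphi))))


# A lepton and missing momentum of 40 GeV each, back to back in azimuth.
mt_back_to_back = transverse_mass(40.0, 0.0, 40.0, math.pi)

# The same magnitudes, collinear in azimuth.
mt_collinear = transverse_mass(40.0, 1.0, 40.0, 1.0)
```

For a single on-shell W decay this quantity cannot exceed the W mass apart from width and resolution effects, which is why the additional missing momentum from two LSPs populates the region above the Jacobian peak.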

Our method follows the hadronic analysis closely. Again assuming stop pair production, 
with each stop decaying to a top and an LSP, we now look for events where one top decays leptonically, 
while the other decays hadronically. We select events with exactly one isolated lepton with 
$p_T > 20$ GeV and $|\eta| < 2.1\,(2.5)$ for muons (electrons), at least one tight b-tagged jet, and 
three or more jets with $p_T > 30$ GeV and $|\eta| < 3$. The primary background is 
reduced to $t\bar t$, with an acceptance efficiency of ~ 15% (including branching ratios). The 
efficiencies and cross section times efficiencies for the stop/LSP signal points are shown in 
Figure 8. 

Focusing on $M_T^W$ above $M_W$ will improve the discrimination of stops from tops. Applying 
a shape analysis as was done in the hadronic $\slashed{E}_T$ case will provide even greater advantages. 
The total SM background distribution for $M_T^W > 85$ GeV can again be well fit by the sum 
of two exponentials: 



$$\frac{d\sigma}{dM_T^W} = A e^{-\alpha M_T^W} + B e^{-\beta M_T^W} . \qquad (7)$$



Repeating the search strategy performed in the hadronic analysis, we use RooFit to find 
the best fit for the parameters in the $M_T^W$ range of 85-400 GeV (see Table II), weighting 
the top background to the equivalent of 20 fb$^{-1}$ of data. Again, the fit errors reported are 
highly correlated. 

Using this fit and the associated errors, we repeat the profile likelihood analysis 
described previously, testing the background versus signal plus background hypotheses over 
200 background-generated pseudo-experiments for each simplified model point. Our results 
are shown in Figure 9 for both the full profile likelihood analysis including all errors, and 
the case of errors inflated by a factor of 3. The sensitivity is similar to that obtained for the 
hadronic analysis. In Figure 10 we perform a cross-check using the cut-and-count method, 
with a background bin of 85-150 GeV used for normalization, a low signal bin of 
150-250 GeV, and a high signal bin of 250-400 GeV. As before, this simple 
analysis both validates and provides motivation for the full shape analysis. 





            fit to 20 fb$^{-1}$ total SM background
$\alpha$    $6.68 \times 10^{-2} \pm 6.88 \times 10^{-4}$
$\beta$     $2.01 \times 10^{-2} \pm 3.04 \times 10^{-3}$

TABLE II: Best fit slope parameters for the background $M_T^W$ distribution, fit to Eq. 7. Note that the 
fit errors are correlated with each other and with the normalizations (A, B), which in turn depend 
on the amount of integrated luminosity considered. 



FIG. 9: Expected number of standard deviations at which the supersymmetric stop signal can be 
excluded, using 200 pseudo-experiments of 20 fb$^{-1}$ and applying the $M_T^W$ shape analysis. In the 
left-hand plot the uncertainties on the background $M_T^W$ shape are as shown in Table II, and in the 
right-hand plot these errors have been inflated by a factor of 3. 

V. CONCLUSION 

Third generation squarks are an integral part of the supersymmetric solution to the 
naturalness and hierarchy problems. More generally, the large Yukawa couplings between 
the top and the Higgs hint at some connection between the third generation and electroweak 
symmetry breaking. Improving the search techniques for stop squarks (and more generally, 
top partners) at the LHC is therefore of great theoretical and experimental interest. In 
this paper, we have demonstrated that a dedicated search for stop pairs in hadronic and 
semi-leptonic channels has the potential to improve the current limits, especially for mass 
values such that the stop and the LSP + top quark system are nearly degenerate. 

FIG. 10: Left: $S/\sqrt{O}$ for the 150-250 GeV region of the signal plus background $M_T^W$ distribution. Right: 
$S/\sqrt{O}$ for the 250-400 GeV region. These are computed with an unrealistic assumption of no 
systematics. 

We see that the transverse momentum in the lab frame produced by the LSPs in stop pair 
decays is larger than naive expectations. Thus hadronic searches that limit the contribution 
to $\slashed{E}_T$ from Standard Model neutrinos can provide significant discrimination between signal 
and background. The most obvious way to access this kinematic information is by modeling 
the shapes of $\slashed{E}_T$ distributions for the most relevant SM backgrounds. As we have shown, 
such an analysis is capable of excluding stops up to ~ 250 GeV in the degenerate case, 
as compared to up to 550 GeV when the LSP is light. However, we expect that other 
$\slashed{E}_T$-based variables could also serve. For the semi-leptonic stop search, we saw that the 
most straightforward approach is to model the shape of the transverse mass variable $M_T^W$, 
which is related to $\slashed{E}_T$. We found that the projected sensitivity to degenerate stops in the 
semi-leptonic case also reaches up to ~ 300 GeV, similar to that in the hadronic channel. 
Finally, since these two channels are independent, we combine these bounds, which we show 
in Figure 11. The resulting exclusion is 560 GeV for light neutralinos and 360 GeV in the 
degenerate case. 
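One simple way to see why combining independent channels extends the reach: if the combination is performed by adding the log likelihoods of the two channels (an assumption here, since the combination procedure is not spelled out), then the test statistics $2\Delta\log L$ add, and the combined number of standard deviations is the quadrature sum of the individual ones. The sketch below uses hypothetical per-channel significances.

```python
import math


def combine_n_sigma(*channels):
    """Quadrature sum of per-channel significances. This is valid when the
    channels are statistically independent and each n_sigma is the square
    root of that channel's 2*Delta(log L)."""
    return math.sqrt(sum(n * n for n in channels))


# Hypothetical significances for one mass point in the two channels.
n_hadronic, n_semileptonic = 1.6, 1.3
n_combined = combine_n_sigma(n_hadronic, n_semileptonic)
```

In this illustration neither channel alone reaches two standard deviations, while the combination does, which is the qualitative behavior expected near the edge of the excluded region.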

We note that the CMS Razor analyses [36-39] access the missing transverse momentum 
of an event through the transverse Razor variable (and through this, the Razor ratio 
R). As such, one would expect that Razor inclusive searches could be competitive with a 
more targeted analysis using the techniques outlined in this report. More generally, our 
$\slashed{E}_T$ search could be upgraded to a multi-dimensional shape analysis as used in the Razor. 
Though, in this theoretical work, the analytic fits for the $\slashed{E}_T$ distributions were drawn 
from Monte Carlo simulation, the experimental collaborations can use data control samples 
to model the background shapes. In the real experimental analyses the optimal baseline 
selections in both the hadronic and semi-leptonic channels could differ from those presented 
here. Furthermore, we have shown that even if the extraction of the fit parameters from 
data suffers from considerably more uncertainty than our Monte Carlo based analysis, the 
shape-based approach, unlike a cut-and-count analysis, still has good reach. 

FIG. 11: Expected sensitivity, in standard deviations, to SUSY stop signals using a combination 
of $\slashed{E}_T$ and $M_T^W$ shape analyses, including all fitting errors in the maximum likelihood method. 

Our results support the assertion that it is not possible for stop squarks lighter than ~ 1 
TeV in R-parity conserving SUSY to elude LHC searches over the long run. A stop discovery 
would be at least as fundamentally important as a Higgs discovery, while complete exclusion 
of stops with mass lighter than a TeV would be a significant blow to our understanding of 
the connection between supersymmetry and electroweak symmetry breaking. 